Research Resources

‘Research Resources’ comprises the scholarly output of the <Age of Disgust, the Response of Humanities>
Agenda Research and major related research resources from Korea and abroad.

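An Analysis of Gender Bias in Korean Language Models Using the Winogender Dataset
  • Authors: 조은비, 이수빈, 송상헌
  • Publisher: 한국언어학회 (The Linguistic Society of Korea)
  • Year of publication: 2024
  • Language: Korean
  • Keywords: Korea language model, gender bias, Winogender dataset, encoder model
  • Material type: Journal article
  • Published in: 언어, Vol. 49, No. 1, pp. 173-205 (33 pages)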

This paper uses the Winogender dataset to investigate how gender bias in occupation nouns and adjectives in Korean language models affects the gender referred to by pronouns. The experiment was conducted in three ways: measuring surprisal scores in Korean and in English using encoder models, and testing the responses of a decoder model, namely ChatGPT-4. The encoder models showed that, regardless of differences in the gender bias of the occupation nouns and adjectives, the male pronoun was used more naturally in sentences than the female pronoun. The decoder model, on the other hand, detected gender bias, especially in sentences containing adjectives. This result points to the influence of gender imbalance in training data and to the functional differences between language generation models and language comprehension models. The study suggests constructing a unique dataset that reflects the characteristics of Korean in order to analyze gender bias in Korean language models more effectively.
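
The abstract does not spell out how the surprisal scores were computed. As a rough illustration only, surprisal is commonly defined as the negative log probability a model assigns to a token, so a lower value means the token reads as more expected in its context. The sketch below assumes base-2 logarithms and uses made-up probabilities for a pronoun slot; the function names `surprisal` and `pronoun_gap` are hypothetical and are not taken from the paper.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits of a token to which a model assigns probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def pronoun_gap(p_male: float, p_female: float) -> float:
    """Female-minus-male surprisal; a positive value means the male pronoun is less surprising."""
    return surprisal(p_female) - surprisal(p_male)


# Hypothetical probabilities for a pronoun slot (illustrative, not results from the paper).
p_male, p_female = 0.5, 0.25
gap = pronoun_gap(p_male, p_female)
print(f"male: {surprisal(p_male):.2f} bits, female: {surprisal(p_female):.2f} bits, gap: {gap:.2f} bits")
```

In an actual experiment the two probabilities would come from a language model's prediction for the pronoun position in each Winogender-style sentence, which this sketch leaves out.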

Table of Contents
1. Introduction
2. Background
3. Data
4. Experiments
5. Comparison of Korean BERTs and ChatGPT4
6. Public Library
7. Conclusion
References

Source: DBPia

Archiving Information